Principal component analysis for authorship attribution
Authors
Abstract
Received: 18 November 2011. Revised: 19 June 2012. Accepted: 22 July 2012. Citation: Jamak, A., Savatić, A., Can, M. (2012). “Principal component analysis for authorship attribution”, Business Systems Research, Vol. 3, No. 2, pp. 49-56. DOI: 10.2478/v10305-012-0012-2
Similar resources
Who Wrote this Novel? Authorship Attribution across Three Languages
Based on different writing style definitions, various authorship attribution schemes have been proposed to identify the real author of a given text or text excerpt. In this article we analyze the relative performance of word types or lemmas assigned to represent styles and texts. As a second objective we compare two authorship attribution approaches, one based on principal component analysis (P...
Authorship Attribution Using Principal Component Analysis and Competitive Neural Networks
Feature extraction is a common problem in statistical pattern recognition. It refers to a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space. However, the transformation is designed in such a way that the data set may be represented by a reduced number of "effective" features and yet retain most of the intr...
An experiment in authorship attribution
This paper reports an experiment in authorship attribution that reveals considerable authorial structure in texts written by authors with very similar background and training, with genre and topic being strictly controlled for. We interpret our results as supporting the hypothesis that authors have ‘textual fingerprints’, at least for texts produced by authors who are not consciously changing t...
Authorship Attribution: A Comparative Study of Three Text Corpora and Three Languages
The first objective of this paper is to carry out three experiments intended to evaluate authorship attribution methods based on three test collections available in three different languages (English, French, and German). In the first we represent and categorize 52 text excerpts written by nine authors and taken from 19th century English novels. In the second we work with 44 segments from French n...
Quel est l'auteur de ce roman ?
In this paper, we present the authorship attribution problem. As text representation, recent studies suggest using a small set of function or very frequent words (50 or 100). On this basis, we can apply either the principal component analysis (PCA) or the correspondence analysis (CA) to visualize the relationships between text surrogates. Using the nearest neighbor approach, we can then suggest...